Mutual Visibility and Information Structure Enhance Synchrony between Speech and Co-Speech Movements
Abstract
Our study aims to gain a better understanding of how speech-gesture synchronization is affected by two factors: (1) mutual visibility and (2) linguistic information structure. To this end, we analyzed spontaneous dyadic interactions in which interlocutors were engaged in a verbalized version of the game TicTacToe, both with and without mutual visibility. The setting allows for a straightforward differentiation of contextually given and informative game moves, which are studied with respect to their manual and linguistic realization. Speech and corresponding manual game moves are synchronized more often when there is mutual visibility and when game moves are informative. Mutual visibility leads to a slight precedence of manual moves over their corresponding verbalizations, and to a tighter temporal alignment of speech and co-speech movements. Informative moves counter the movement-precedence effect, thus allowing co-speech movement targets to synchronize smoothly with prosodic boundaries.
Similar articles
Audio-visual synchrony for detection of monologues in video archives
In this paper we present our approach to detecting monologues in video shots. A monologue shot is defined as a shot containing a talking person in the video channel with the corresponding speech in the audio channel. Whilst motivated by the TREC 2002 Video Retrieval Track (VT02), the underlying approach of exploiting synchrony between audio and video signals is also applicable to voice- and face-based biome...
Innovative Speech Reconstructive Surgery
Proper speech functioning in human beings depends on precise coordination and timing across a series of complex neuromuscular movements and actions: starting from the prime energy source of air expelled from the respiratory system; delivering this air to trigger the vocal cords; the swift shaping of this phonatory episode into a comprehensible sound through RESONANCE; and the final coordination of all h...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for feature extraction as well as acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Robust audio-visual speech synchrony detection by generalized bimodal linear prediction
We study the problem of detecting audio-visual synchrony in video segments containing a speaker in frontal head pose. The problem has a number of important applications, for example speech source localization, speech activity detection, speaker diarization, speech source separation, and biometric spoofing detection. In particular, we build on earlier work, extending our previously proposed ti...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, speech signal frames are processed uniformly, while the information is not evenly distributed ...